Elimination of trajectory folding phenomenon: HMM, trajectory mixture HMM and mixture stochastic trajectory model
Authors
Abstract
This paper studies the topology of the Hidden Markov Model (HMM) used in speech recognition. Our main contribution is to introduce the notion of the trajectory folding phenomenon in HMMs. In complex phonetic contexts and under speaker variability, this phenomenon degrades the discriminability of the HMM. The goal of this paper is to offer an explanation of the phenomenon and experimental evidence suggesting that it exists. Two systems eliminate trajectory folding, partially or entirely: an HMM with a special topology, called the Trajectory Mixture HMM (TMHMM), and the recently proposed Mixture Stochastic Trajectory Model (MSTM). HMM, TMHMM and MSTM were tested on a continuous French speech recognition task with a 1011-word vocabulary, in speaker-dependent and multi-speaker settings. With a similar number of model parameters, TMHMM and MSTM cut down the error rate produced by the HMM, which confirms our hypothesis.
Similar resources
Trajectory Clustering Using Longer Length Units for Automatic Speech Recognition
One of the major deficiencies of conventional hidden Markov modelling (HMM) is known as the trajectory folding phenomenon. Multipath models can solve the trajectory folding problem by assuming that a large part of the variation in acoustic data can be attributed to different observation classes, which can then be modelled separately. In this paper, we present an approach to automatically clu...
Modulation spectrum-constrained trajectory training algorithm for HMM-based speech synthesis
This paper presents a novel training algorithm for Hidden Markov Model (HMM)-based speech synthesis. One of the biggest issues causing significant quality degradation in synthetic speech is the over-smoothing effect often observed in generated speech parameter trajectories. Recently, we have found that a Modulation Spectrum (MS) of the generated speech parameters is sensitively correlated with ...
NATIONAL UNIVERSITY OF SINGAPORE School of Computing PH.D DEFENCE - PUBLIC SEMINAR
Automatic Speech Recognition (ASR) has been one of the most popular research areas in computer science. Many state-of-the-art ASR systems still use the Hidden Markov Model (HMM) for acoustic modelling due to its efficient training and decoding. The HMM state output probability of an observation is assumed to be independent of the other states and the surrounding observations. Since temporal correla...
Coarticulation modeling by embedding a target-directed hidden trajectory model into HMM - MAP decoding and evaluation
The Hidden Dynamic Model (HDM) has been an attractive acoustic modeling approach because it provides a computational model for coarticulation and the dynamics of human speech. However, the lack of a direct decoding algorithm has been a barrier to research progress on HDM. We have developed a new HDM-based acoustic model, the Hidden-Trajectory HMM (HTHMM), which combines the state/mixture topolo...
Trajectory modeling based on HMMs with the explicit relationship between static and dynamic features
This paper shows that the HMM whose state output vector includes static and dynamic feature parameters can be reformulated as a trajectory model by imposing the explicit relationship between the static and dynamic features. The derived model, named trajectory HMM, can alleviate the limitations of HMMs: i) constant statistics within an HMM state and ii) independence assumption of state output pr...